Abstract



Tuffley and Steel (1997) proved that Maximum Likelihood and Maximum Par- 
simony methods in phylogenetics are equivalent for sequences of characters under a 
simple symmetric model of substitution with no common mechanism. This result 
has been widely cited ever since. We show that small changes to the model assump- 
tions suffice to make the two methods inequivalent. In particular, we analyze the 
case of bounded substitution probabilities as well as the molecular clock assump- 
tion. We show that in these cases, even under no common mechanism, Maximum 
Parsimony and Maximum Likelihood might make conflicting choices. We also show 
that if there is an upper bound on the substitution probabilities which is 'sufficiently 
small', every Maximum Likelihood tree is also a Maximum Parsimony tree (but not 
vice versa). 

Keywords: phylogenetics, maximum parsimony, maximum likelihood, molecular 
clock 
1 Introduction



Stochastic models for nucleotide substitution and tree reconstruction methods for inferring
phylogenetic trees are used to interpret the ever-growing amount of available genetic
sequence data. Unsurprisingly, such models and methods have therefore been widely discussed
in the last decades (e.g., [Felsenstein 1978], [Felsenstein 2004], [Semple and Steel 2003],
[Yang 2006]). Two of the most frequently used tree reconstruction methods are Maximum
Parsimony (MP) and Maximum Likelihood (ML). A basic difference between these two methods is
that MP, unlike ML, is not based on a specific nucleotide substitution model. If the
sequences under consideration are related by a specific model of substitution, the results
of MP and ML may coincide [Hendy and Penny 1989], but there are also examples, such as the
famous 'Felsenstein Zone', for which this is not the case [Felsenstein 1978].



In 1997, Tuffley and Steel took an important step forward in the analysis of MP and ML
[Tuffley and Steel 1997]: they showed that a particular symmetric model of substitution
with 'no common mechanism' is sufficient for MP and ML to be equivalent when applied
to a sequence of characters.

The purpose of this paper is to analyze this equivalence of MP and ML further by
considering slightly modified model assumptions that are of biological relevance. For
instance, MP is often assumed to be justified whenever the nucleotide substitution
probabilities are small (e.g., [Felsenstein 2004], p. 101). Therefore, we restrict the model by
placing an upper bound on these probabilities, and find that under no common mechanism
MP and ML are no longer equivalent. Moreover, the equivalence of MP and ML under a
'no common mechanism' model also fails under the constraint of a molecular clock, even
without a bound on the substitution probabilities. These two claims will be established
by constructing counterexamples that are minimal with respect to the number of taxa. To
construct our examples, we exploit a useful property of the likelihood function for a 'no
common mechanism' model, namely that it is multilinear in the substitution probabilities.
This fact underlies Equation 18 and Lemma 2 in [Tuffley and Steel 1997], which we use
in our arguments.

We then go on to prove bounds on the probability of observing a given sequence of 
characters on a tree, and use them to show that it is possible to choose sufficiently small 
substitution probabilities (depending on the number of taxa, the number of characters 
and the number of states) so that every tree chosen by ML is also a most parsimonious 
tree. 



2 Notation and Model Assumptions 

Recall that a phylogenetic X-tree is a tree $T = (V(T), E(T))$ on a leaf set
$X = \{1, \ldots, m\} \subset V(T)$ with no vertices of degree 2. Note that the tree does not
have to be binary. Furthermore, recall that a character $f$ is a function $f : X \to C$ for
some set $C := \{c_1, c_2, c_3, \ldots, c_r\}$ of $r$ character states ($r \in \mathbb{N}$). An
extension of $f$ to $V(T)$ is a map $g : V(T) \to C$ such that $g(i) = f(i)$ for all $i$ in
$X$. For such an extension $g$ of $f$, we denote by $l_T(g)$ the number of edges
$e = \{u,v\}$ in $T$ on which a substitution (mutation) occurs, i.e. where $g(u) \neq g(v)$.
The parsimony score of $f$ on $T$, denoted by $l_T(f)$, is obtained by minimizing $l_T(g)$
over all possible extensions $g$. The parsimony score of a sequence of characters
$S := f_1 f_2 \cdots f_n$ is given by

$$l_T(S) = \sum_{i=1}^{n} l_T(f_i).$$

Recall that a character $f$ on a leaf set $X$ is said to be informative (with respect to
parsimony) if at least two distinct character states occur more than once on $X$. Otherwise
$f$ is called non-informative. Note that for a non-informative character $f$,
$l_{T_i}(f) = l_{T_j}(f)$ for all trees $T_i$, $T_j$ on the same set $X$ of leaves.
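The definition of the parsimony score can be made concrete with a small brute-force sketch that enumerates every extension of a character and counts the edges with a state change. The function name `parsimony_score`, the edge-list encoding of a tree, and the vertex labels `"v"` and `"w"` are illustrative assumptions, not notation from this paper; the approach is only practical for very small trees.

```python
from itertools import product

def parsimony_score(edges, leaf_states, internal_vertices, states):
    """Return l_T(f): the minimum number of edges with a state change over all extensions."""
    best = None
    for assignment in product(states, repeat=len(internal_vertices)):
        g = dict(leaf_states)                       # an extension agrees with f on the leaves
        g.update(zip(internal_vertices, assignment))
        changes = sum(g[a] != g[b] for a, b in edges)
        best = changes if best is None else min(best, changes)
    return best

# Four-taxon trees 12|34 and 13|24 with (hypothetically named) internal vertices "v" and "w".
tree_12_34 = [(1, "v"), (2, "v"), (3, "w"), (4, "w"), ("v", "w")]
tree_13_24 = [(1, "v"), (3, "v"), (2, "w"), (4, "w"), ("v", "w")]

f_informative = {1: "a", 2: "a", 3: "b", 4: "b"}     # two states occur twice each
f_noninformative = {1: "a", 2: "b", 3: "c", 4: "b"}  # only one state occurs more than once
```

In this sketch the informative character distinguishes the two trees, while the non-informative one receives the same score on both, in line with the note above.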



Next we describe the fully symmetric r-state model [Neyman 1971], also known as
the $N_r$-model, which underlies the Tuffley and Steel equivalence result.

Consider a phylogenetic X-tree $T$ arbitrarily rooted at one of its vertices. The $N_r$-model
assumes that a state is assigned to the root from the uniform distribution on the
set of states. The state then evolves away from the root as follows. The model assumes
equal rates of substitutions between any two distinct character states. For any edge
$e = \{u, v\} \in E(T)$, where $u$ is the vertex closer to the root, let $p_e$ denote the
conditional probability $P(v = c_i | u = c_j)$, where $c_i \neq c_j$. The probability $p_e$ is
equal for all pairs of distinct states $c_i$ and $c_j$. Therefore, the probability that a
substitution ($c_j$ to a state different from $c_j$) occurs on the edge $e$ is $(r-1)p_e$.
Let $q_e$ be the conditional probability $P(v = c_j | u = c_j)$, i.e. the probability that no
substitution occurs on edge $e$. In the $N_r$-model, we have $0 \le p_e \le \frac{1}{r}$ for
all $e \in E(T)$, and $(r-1)p_e + q_e = 1$. Moreover, the $N_r$-model assumes that
substitutions on different edges are independent. Note that for $r = 4$, the $N_r$-model
coincides with the Jukes-Cantor model [Jukes and Cantor 1969].



Let $T$ be a phylogenetic X-tree and let $f$ be a character on its leaf set $X$. Let the
substitution probabilities assigned to the edges of $T$ under the $N_r$-model be collectively
denoted by $p := (p_e : e \in E(T))$. Then we denote by $P(f|T,p)$ the probability of
observing character $f$ given tree $T$ and the parameter values $p$. Note that $P(f|T,p)$ does
not depend on the root position, since the model is symmetric. The maximum value of this
probability for fixed $f$ and $T$ as $p$ ranges over all possibilities is denoted by
$\max P(f|T)$, i.e. $\max P(f|T) := \max_p P(f|T,p)$.
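As a minimal sketch of how $P(f|T,p)$ can be evaluated for small trees, the function below sums the probabilities of all extensions of $f$. Each extension contributes the factor $\frac{1}{r}$ for the root state, $p_e$ for every edge with a substitution, and $1-(r-1)p_e$ for every edge without one. The name `character_probability` and the edge-list encoding are assumptions made for illustration only.

```python
from itertools import product

def character_probability(edges, probs, leaf_states, internal_vertices, states):
    """P(f | T, p) under the N_r model, summed over all extensions of f."""
    r = len(states)
    total = 0.0
    for assignment in product(states, repeat=len(internal_vertices)):
        g = dict(leaf_states)
        g.update(zip(internal_vertices, assignment))
        prob = 1.0 / r                              # uniform distribution on the root state
        for (a, b), p_e in zip(edges, probs):
            # p_e for one specific substitution, 1 - (r - 1) p_e for no substitution
            prob *= p_e if g[a] != g[b] else 1.0 - (r - 1) * p_e
        total += prob
    return total

# Four-taxon tree 12|34; the last edge is the internal edge.
edges = [(1, "v"), (2, "v"), (3, "w"), (4, "w"), ("v", "w")]
f = {1: "a", 2: "a", 3: "b", 4: "b"}
u = 0.1
p_internal_only = [0.0, 0.0, 0.0, 0.0, u]           # substitution allowed on the internal edge only
value = character_probability(edges, p_internal_only, f, ["v", "w"], "abc")
```

Because the sum treats every vertex symmetrically, the result does not depend on which vertex is regarded as the root, as noted above.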

Now let $S := f_1, \ldots, f_n$ be a sequence of characters. In this paper, we analyze
sequences of characters under the $N_r$-model with no common mechanism. This means that
the substitution probabilities on edges may be different for different characters in $S$
without any correlation between the characters. We suppose that for each character $f_i$ in
the sequence and for each edge $e$ of the tree, there is a parameter $p_{e,i}$ that gives the
substitution probability for $f_i$ on edge $e$, and that the parameters $p_{e,i}$ are all
independent. For $i = 1, \ldots, n$, let $p_i := (p_{e,i} : e \in E(T))$ be the vectors of
substitution probabilities. We denote the model parameters $(p_i : i = 1, \ldots, n)$
collectively as $\theta$. Then the probability of observing the sequence of characters $S$ on
tree $T$ for the given parameters $\theta$ is given by:

$$P(S|T,\theta) = \prod_{i=1}^{n} P(f_i|T,p_i),$$

which follows from the fact that the characters are independent.

We refer to $P(S|T,\theta)$ as the probability of observing sequence $S$ given the
phylogenetic tree $T$ and model parameters $\theta$. We then define the likelihood of the
tree $T$ and the model parameters $\theta$ given the sequence $S$, which we refer to as the
likelihood function, as $L(T,\theta|S) := P(S|T,\theta)$. The maximum likelihood method of
phylogenetic tree reconstruction involves optimizing the likelihood function in two steps as
described in [Semple and Steel 2003]. We first maximize $P(S|T,\theta)$ over the space of
model parameters $\theta$. We define:

$$\max P(S|T) := \max_{\theta} P(S|T,\theta).$$

We then choose a tree $T$ that maximizes $\max P(S|T)$. We call such a tree a maximum
likelihood tree (ML-tree) of $S$. Thus, an ML-tree of a sequence $S$ is
$\arg\max_T (\max P(S|T))$. Note that under the assumption of no common mechanism, we have:

$$\max P(S|T) = \prod_{i=1}^{n} \max_{p_i} P(f_i|T,p_i).$$



3 Results 



Using the notation introduced in the previous section, we are now in a position to state 
the equivalence result of Tuffley and Steel explicitly. 

Theorem 3.1. (Tuffley and Steel 1997). Let $T$ be a phylogenetic X-tree and let
$S := f_1, \ldots, f_n$ be a sequence of $r$-state characters on $X$. Then, under the
$N_r$-model with no common mechanism, we have:

$$\max P(S|T) = r^{-l_T(S) - n}.$$

Thus ML and MP both choose the same tree(s).

In the following, we show that small changes to the assumptions of the $N_r$-model may
be enough to make this equivalence fail. In particular, we analyze two settings of biological
interest: first, we consider bounded substitution probabilities; secondly, we investigate the
case of a molecular clock. In both cases, we explicitly construct examples in which MP
and ML choose different sets of trees under no common mechanism.

3.1 Bounded substitution probabilities 

In this section, we consider a modification of the $N_r$-model in which the substitution
probabilities on all edges are bounded above by some $u < \frac{1}{r}$. We construct character
sequences for which MP and ML choose different sets of trees.

Proposition 3.2. Under the $N_r$-model with no common mechanism, for $r \ge 2$, there
exist values of $u$ such that if the substitution probabilities are bounded above by $u$, MP
and ML choose different sets of trees. In particular, we have:

1. For $r = 2$, for all values of $u \in (0, 1 - \frac{1}{\sqrt{2}})$, there exist sequences of
characters for which MP and ML choose different sets of trees.

2. For $r > 2$, for all values of $u \in (0, \frac{1}{r})$, there exist sequences of characters
for which MP and ML choose unique and distinct trees.

In order to prove this proposition, it is necessary to summarize the main idea of the 
original proof of the Tuffley-Steel result. We state it here in a more general form so that it 
may be used to analyze the situation in which the substitution probabilities are bounded. 

Lemma 3.3. Let $T$ be a phylogenetic X-tree and let $f$ be a character on $X$. Then
under the $N_r$-model with all substitution probabilities bounded by $u$, where
$0 < u \le \frac{1}{r}$, the probability $P(f|T,p)$ can be maximized at a point where all
substitution probabilities are either 0 or $u$.

Lemma 3.3 is the same as Lemma 2 in [Tuffley and Steel 1997] except that Tuffley
and Steel stated their result only for $u = \frac{1}{r}$. However, this assumption is not used
in their proof and is therefore not required for the lemma to hold. Tuffley and Steel used it
to explicitly maximize the probability of observing a character on a given tree under the
$N_r$-model: for a given character $f$ and tree $T$ with a most parsimonious extension $g$ of
$f$, assigning substitution probability $\frac{1}{r}$ to edges where a substitution is induced
by $g$, and 0 elsewhere, gives $\max_p P(f|T,p)$ (cf. Theorem 3 of [Tuffley and Steel 1997]).
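The corner property of Lemma 3.3 suggests a direct numerical check for small trees: because $P(f|T,p)$ is multilinear in $p$, it suffices to evaluate it at the $2^{|E(T)|}$ points with every $p_e \in \{0, u\}$. The sketch below does this for $u = \frac{1}{r}$ on a four-taxon tree and compares the result with $r^{-l_T(f)-1}$, the single-character case of Theorem 3.1. All function names and the tree encoding are illustrative assumptions.

```python
from itertools import product

def extensions(leaf_states, internal, states):
    """Yield every extension g of the character given by leaf_states."""
    for assignment in product(states, repeat=len(internal)):
        g = dict(leaf_states)
        g.update(zip(internal, assignment))
        yield g

def parsimony_score(edges, leaf_states, internal, states):
    return min(sum(g[a] != g[b] for a, b in edges)
               for g in extensions(leaf_states, internal, states))

def char_prob(edges, probs, leaf_states, internal, states):
    r = len(states)
    total = 0.0
    for g in extensions(leaf_states, internal, states):
        prob = 1.0 / r
        for (a, b), p_e in zip(edges, probs):
            prob *= p_e if g[a] != g[b] else 1.0 - (r - 1) * p_e
        total += prob
    return total

def max_over_corners(edges, leaf_states, internal, states, u):
    """Maximum of P(f | T, p) over all p with every p_e in {0, u}."""
    return max(char_prob(edges, probs, leaf_states, internal, states)
               for probs in product((0.0, u), repeat=len(edges)))

states = "abc"
r = len(states)
edges = [(1, "v"), (2, "v"), (3, "w"), (4, "w"), ("v", "w")]   # tree 12|34
f = {1: "a", 2: "b", 3: "c", 4: "b"}
score = parsimony_score(edges, f, ["v", "w"], states)
best = max_over_corners(edges, f, ["v", "w"], states, 1.0 / r)
```

With $u = \frac{1}{r}$ the value `best` should agree with `r ** -(score + 1)` up to rounding; for smaller $u$ the maximizing corner need not correspond to a single most parsimonious extension.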



But it turns out that an ML solution cannot be similarly related to an MP solution
when $u < \frac{1}{r}$. That is, if $g$ is a most parsimonious extension of a character $f$,
then we may not be able to maximize the probability by simply assigning the substitution
probability $u$ to edges on which there is a substitution in $g$, and 0 to edges on which
there is no substitution in $g$. The probability may actually be maximized at some other
corner of the feasibility region of $p$. This is the idea of the following construction.



Proof of Proposition 3.2. We provide examples of sequences of characters for which MP
and ML may choose different sets of trees. We first prove the case $r = 2$ with an example
on five taxa, and show that in this case, there are no such examples on fewer than five
taxa. Then we explicitly prove the case $r = 3$ with an example on four taxa and show
how this example can be generalized for $r > 3$.

Case r = 2:

Let the set of character states be $\{\alpha, \beta\}$. Consider the two trees $T_1$ and $T_2$
shown in Figure 1 alongside the characters $f_1 = \alpha\alpha\beta\beta\beta$ and
$f_2 = \alpha\beta\alpha\beta\beta$. We consider the character sequence $S := f_1 f_2$.

Note that $l_{T_1}(f_1) = l_{T_2}(f_2) = 1$ and $l_{T_1}(f_2) = l_{T_2}(f_1) = 2$. Therefore,
$l_{T_1}(S) = l_{T_2}(S) = 3$, which means that MP will not favor either of the two trees
$T_1$, $T_2$ over the other one. Moreover, as $f_1$ and $f_2$ are incompatible with one
another, it can easily be seen that both trees are actually MP-trees: the minimal score of
either character is 1, as two states are employed, and this score is achieved when the
character corresponds to a split on an edge of the underlying tree. Because of the
incompatibility, however, the other character will then have a score of at least 2. So for
$S$, a score of 3 is best possible, and thus both $T_1$ and $T_2$ are MP-trees.

For ML, the situation is different. This is because the assignments of $f_1$ on $T_2$ and
$f_2$ on $T_1$ differ, as highlighted by Figure 1. In fact, character $f_1$ has a unique most
parsimonious extension on $T_2$, whereas $f_2$ has two most parsimonious extensions on $T_1$.
As we show in the following, for a sufficiently small upper bound $u$, the likelihood function
is maximized when these extensions both contribute to the likelihood. We use a symbolic
algebra system to evaluate $P(f_i|T,p)$ for $i = 1,2$, for all trees on five taxa and at all
corners of the feasibility region of $p$ (see Lemma 3.3). More specifically, for the five-leaf
trees under investigation, there are seven edges to which either 0 or $u$ can be assigned,
which gives $2^7 = 128$ possible parameter vectors $p$ at which the likelihood might be
maximized. We observe that $\max P(f_1|T_1) = \max P(f_2|T_2) = \frac{1}{2}u$ but
$\max P(f_1|T_2) = \frac{1}{2}u^2$ and
$\max P(f_2|T_1) = \max(\frac{1}{2}u^2, u^2(1-u)^2)$. So there are choices of $u$, namely all
$u < 1 - \frac{1}{\sqrt{2}}$, for which $\max P(f_1|T_2) < \max P(f_2|T_1)$. In these cases,
even though both $T_1$ and $T_2$ are MP-trees, ML will favor tree $T_1$ over $T_2$. Therefore,
MP and ML are not equivalent in this case.

Now let sequence $S$ contain $n$ copies of character $f_1$ and $n + 1$ copies of character
$f_2$ for some integer $n > 0$. Then, clearly $l_{T_1}(S) = 3n + 2$, but $l_{T_2}(S) = 3n + 1$.
Therefore, MP will favor tree $T_2$ over $T_1$. Moreover, $T_2$ is an MP-tree (by the same
incompatibility argument concerning $f_1$ and $f_2$ as above). On the other hand, we have
$\max P(S|T_1) = (\frac{1}{2}u)^n (u^2(1-u)^2)^{n+1}$ and
$\max P(S|T_2) = (\frac{1}{2}u^2)^n (\frac{1}{2}u)^{n+1}$ (provided
$u < 1 - \frac{1}{\sqrt{2}}$). We choose a sufficiently large value of $n$ so that the former
value is larger than the latter. For such choices of $n$, ML will favor tree $T_1$ over $T_2$,
even though MP favors $T_2$. It is important to note, however, that for the sequence $S$, the
tree $T_1$ is not an ML-tree. It can be easily verified for the tree $T_3$ in Figure 2 that
$\max P(S|T_3) = (\frac{1}{2}u)^{n+1} (u^2(1-u)^2)^n$, which is more than $\max P(S|T_1)$. In
fact, $\max P(S|T_3) > \max P(S|T_1)$ for all $u < \frac{1}{2}$, and further work shows that
$T_3$ is the unique ML-tree. Moreover, $T_3$ is also an MP-tree. So for $r = 2$, it remains
unclear whether MP and ML can make strictly conflicting choices.
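The corner evaluation described above can be reproduced in a few lines without a symbolic algebra system. Figure 1 is not reproduced here, so the two topologies used below are assumptions chosen to match the parsimony scores and the numbers of most parsimonious extensions stated in the text: $T_1$ with cherries $\{1,2\}$ and $\{4,5\}$, and $T_2$ with cherries $\{1,3\}$ and $\{2,5\}$. The function names and the value $u = 0.1$ are likewise illustrative.

```python
from itertools import product

def char_prob(edges, probs, leaf_states, internal, states):
    r = len(states)
    total = 0.0
    for assignment in product(states, repeat=len(internal)):
        g = dict(leaf_states)
        g.update(zip(internal, assignment))
        prob = 1.0 / r
        for (a, b), p_e in zip(edges, probs):
            prob *= p_e if g[a] != g[b] else 1.0 - (r - 1) * p_e
        total += prob
    return total

def max_over_corners(edges, leaf_states, internal, states, u):
    # 2^7 = 128 corners for a binary five-leaf tree
    return max(char_prob(edges, probs, leaf_states, internal, states)
               for probs in product((0.0, u), repeat=len(edges)))

STATES = "ab"
INTERNAL = ["v1", "v2", "v3"]
# Assumed topologies (hypothetical stand-ins for Figure 1).
T1 = [(1, "v1"), (2, "v1"), ("v1", "v2"), (3, "v2"), ("v2", "v3"), (4, "v3"), (5, "v3")]
T2 = [(1, "v1"), (3, "v1"), ("v1", "v2"), (4, "v2"), ("v2", "v3"), (2, "v3"), (5, "v3")]
f1 = dict(zip(range(1, 6), "aabbb"))
f2 = dict(zip(range(1, 6), "ababb"))

u = 0.1  # any value below 1 - 1/sqrt(2) should behave similarly
max_f1_T1 = max_over_corners(T1, f1, INTERNAL, STATES, u)
max_f2_T2 = max_over_corners(T2, f2, INTERNAL, STATES, u)
max_f1_T2 = max_over_corners(T2, f1, INTERNAL, STATES, u)
max_f2_T1 = max_over_corners(T1, f2, INTERNAL, STATES, u)

def smallest_n():
    """Smallest n for which n copies of f1 and n + 1 copies of f2 are more likely on T1 than on T2."""
    n = 1
    while max_f1_T1 ** n * max_f2_T1 ** (n + 1) <= max_f1_T2 ** n * max_f2_T2 ** (n + 1):
        n += 1
    return n
```

Under these assumptions the four maxima should match the closed forms quoted in the proof, and `smallest_n()` gives a sequence length at which ML prefers $T_1$ although MP prefers $T_2$.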




Figure 2: Tree $T_3$ is both an MP- and an ML-tree for sequence $S$.

Note that when r = 2, examples demonstrating the inequivalence of MP and ML 
cannot be constructed with fewer than five taxa. This is because given at most one 
interior edge, it can be easily checked that all non-informative binary characters have the 
same maximum probability on all trees, whereas informative binary characters on four 
taxa have a higher probability on the tree where they have parsimony score 1 (calculation 
not shown). 

Case r = 3:

Let the set of character states be $\{\alpha, \beta, \gamma\}$. We consider four taxa and the
characters $f_1 := \alpha\alpha\beta\beta$ and $f_2 := \alpha\beta\gamma\beta$, as well as the
sequence $S$ of characters defined by $S := f_1 f_2 \cdots f_2$, where $f_2$ is repeated $n$
times. Two of the three possible trees on four taxa are shown in Figure 3: the tree
$T_4 = 12|34$ and the tree $T_5 = 13|24$.







Figure 3: Tree $T_4$ illustrated in (a) is the unique MP-tree for $S$, whereas (b) depicts tree $T_5$, which is the unique ML-tree
for $S$ when $n$ is chosen sufficiently large.



Tree $T_4$ is clearly the unique MP-tree of $S$, as the only informative character in $S$ is
$f_1 = \alpha\alpha\beta\beta$.

The ML-trees are obtained as described at the end of Section 2. As before, we used
a symbolic algebra system to evaluate $P(f|T,p)$ for all characters $f$ in the sequence, for
all trees on four taxa and at all corners of the feasibility region of $p$ (see Lemma 3.3). We
observed that $\max P(f_2|T_4) = \frac{1}{3}u^2$ and $\max P(f_2|T_5) = u^2(1 - 2u)$.
Therefore, for all $u < \frac{1}{3}$, we have $\max P(f_2|T_5) > \max P(f_2|T_4)$. Now for any
$u < \frac{1}{3}$, a sufficiently large value of $n$ may be chosen such that
$\frac{\max P(S|T_5)}{\max P(S|T_4)} > 1$. We do not analyze the character $f_1$, although
the actual choice of $n$ will depend on the ratio
$\frac{\max P(f_1|T_5)}{\max P(f_1|T_4)}$ and on $u$. Therefore, MP and ML choose different
trees in this three-state setting. Moreover, it turns out that for the third topology on four
taxa, namely $T_6 = 14|23$, we have $\max P(S|T_6) < \max P(S|T_5)$ for all choices of
$u < \frac{1}{3}$ (calculations not shown). So, $T_5$ is the unique ML-tree, whereas $T_4$ is
the unique MP-tree in this setting. So MP and ML make strictly conflicting choices.
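The four-taxon, three-state example can be checked numerically in the same way. The sketch below evaluates the corner maxima for $f_1$ and $f_2$ on $T_4 = 12|34$ and $T_5 = 13|24$ and then searches for the smallest $n$ at which $T_5$ becomes more likely than $T_4$. The helper names and the choice $u = 0.1$ are assumptions for illustration; the paper itself used a symbolic algebra system.

```python
from itertools import product

def char_prob(edges, probs, leaf_states, internal, states):
    r = len(states)
    total = 0.0
    for assignment in product(states, repeat=len(internal)):
        g = dict(leaf_states)
        g.update(zip(internal, assignment))
        prob = 1.0 / r
        for (a, b), p_e in zip(edges, probs):
            prob *= p_e if g[a] != g[b] else 1.0 - (r - 1) * p_e
        total += prob
    return total

def max_over_corners(edges, leaf_states, internal, states, u):
    return max(char_prob(edges, probs, leaf_states, internal, states)
               for probs in product((0.0, u), repeat=len(edges)))

STATES = "abc"
INTERNAL = ["v", "w"]
T4 = [(1, "v"), (2, "v"), (3, "w"), (4, "w"), ("v", "w")]   # 12|34
T5 = [(1, "v"), (3, "v"), (2, "w"), (4, "w"), ("v", "w")]   # 13|24
f1 = dict(zip((1, 2, 3, 4), "aabb"))
f2 = dict(zip((1, 2, 3, 4), "abcb"))

u = 0.1  # illustrative bound below 1/3
m_f1_T4 = max_over_corners(T4, f1, INTERNAL, STATES, u)
m_f1_T5 = max_over_corners(T5, f1, INTERNAL, STATES, u)
m_f2_T4 = max_over_corners(T4, f2, INTERNAL, STATES, u)
m_f2_T5 = max_over_corners(T5, f2, INTERNAL, STATES, u)

# Smallest n such that S = f1 followed by n copies of f2 is more likely on T5 than on T4.
n = 1
while m_f1_T5 * m_f2_T5 ** n <= m_f1_T4 * m_f2_T4 ** n:
    n += 1
```

The loop terminates because each additional copy of $f_2$ multiplies the likelihood ratio by $3(1-2u) > 1$.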

Case r > 3:

Let the set of states be $C := \{\alpha, \beta, \gamma, \delta_1, \ldots, \delta_{r-3}\}$. Let
$V := \{\delta_1, \delta_2, \ldots, \delta_{r-3}\}$. We analyze four taxa and the same
characters $f_1 := \alpha\alpha\beta\beta$ and $f_2 := \alpha\beta\gamma\beta$ that were
analyzed in the case $r = 3$, but this time under the $N_r$-model with $r > 3$. Again we
consider the sequence of characters $S := f_1 f_2 \cdots f_2$, where $f_2$ is repeated $n$
times.

We only sketch the proof in this case. In particular, we indicate how the expressions 
for the likelihood function may be written regardless of the number of states. 

The expressions for $P(f_i|T_j)$ for $i = 1,2$ and $j = 4,5,6$ can be written in a simple
manner since the states $\delta_i$ do not occur in $S$. For example, let the substitution
probabilities on the edges of a four-taxa tree $T$ be $p = (p_i, i = 1,2,\ldots,5)$, where
$p_i$, $i = 1,2,3,4$ are the substitution probabilities on the pendant edges adjacent to taxa
1, 2, 3, 4, respectively, and $p_5$ is the substitution probability on the internal edge. Let
$v$ and $w$ be the internal vertices of $T$. We write $P(f_i|T,p) = \sum_g P(g|T,p)$, where
the summation is over all extensions $g$ of $f_i$.

Now observe that if $g$ and $h$ are two extensions of either $f_1$ or $f_2$, then we have
$P(g|T,p) = P(h|T,p)$ if $g(v), h(v) \in V$ and $g(w) = h(w) = s$ (or vice versa with the
roles of $v$ and $w$ interchanged).

Therefore:

$$\sum_{g : g(v) \in V,\ g(w) = s \notin V} P(g|T,p) = (r-3)P(h|T,p),$$

where $h$ is an extension of $f$ for which $h(v) = \delta_1$ and $h(w) = s$.
Similarly:

$$\sum_{g : g(v) = s \notin V,\ g(w) \in V} P(g|T,p) = (r-3)P(h|T,p),$$

where $h$ is an extension of $f$ for which $h(v) = s$ and $h(w) = \delta_1$.
Finally, including the factor $\frac{1}{r}$ for the root state as in the rest of this section:

$$\sum_{g : g(v) \in V,\ g(w) \in V} P(g|T,p) = \frac{1}{r}(r-3)(1 - 3p_5)p_1p_2p_3p_4.$$

With these observations, it is possible to write the expressions for computing $P(f_i|T_j)$
in a computer algebra system. As in the case $r = 3$, we analyzed only $P(f_2|T_4)$ and
$P(f_2|T_5)$, and verified that $\max P(f_2|T_5) \ge \frac{u^2(3 - 2ru)}{r}$ and
$\max P(f_2|T_4) = \frac{u^2}{r}$. Since $(3 - 2ru) > 1$ for all $u < \frac{1}{r}$, there is an
$n$ for which $\max P(S|T_5) > \max P(S|T_4)$. This means that ML will favor $T_5$ over $T_4$,
even though $T_4$ is the unique MP-tree in this setting. □
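The lower bound $\frac{u^2(3-2ru)}{r}$ for $f_2$ on $T_5$ can be reproduced for several values of $r$ by evaluating $P(f_2|T_5,p)$ at one particular corner: probability $u$ on the pendant edges of taxa 1 and 3 and on the internal edge, and 0 on the pendant edges of taxa 2 and 4. That this is the corner behind the bound is an assumption of this sketch (the text does not name it); states are encoded as integers, with 0, 1, 2 standing for $\alpha$, $\beta$, $\gamma$.

```python
from itertools import product

def char_prob(edges, probs, leaf_states, internal, states):
    r = len(states)
    total = 0.0
    for assignment in product(states, repeat=len(internal)):
        g = dict(leaf_states)
        g.update(zip(internal, assignment))
        prob = 1.0 / r
        for (a, b), p_e in zip(edges, probs):
            prob *= p_e if g[a] != g[b] else 1.0 - (r - 1) * p_e
        total += prob
    return total

T5 = [(1, "v"), (3, "v"), (2, "w"), (4, "w"), ("v", "w")]   # 13|24
f2 = {1: 0, 2: 1, 3: 2, 4: 1}                               # alpha, beta, gamma, beta

def corner_value(r, u):
    """P(f2 | T5, p) with u on the edges to taxa 1 and 3 and on the internal edge."""
    states = list(range(r))
    probs = [u, u, 0.0, 0.0, u]   # same order as the edge list T5
    return char_prob(T5, probs, f2, ["v", "w"], states)

def closed_form(r, u):
    return u ** 2 * (3 - 2 * r * u) / r
```

For example, `corner_value(5, 0.1)` and `closed_form(5, 0.1)` should agree up to rounding, and for $r = 3$ the closed form reduces to $u^2(1-2u)$.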

Remark 1. It is important to state that in the examples for $r \ge 3$ introduced in the
proof of Proposition 3.2, where the number of taxa is bounded (in fact, it is only 4), as $u$
approaches $\frac{1}{r}$, we require $n$ to tend to infinity for ML and MP to make different
choices. However, this is a necessary property of any such example for which the number of
taxa is bounded: for any fixed character sequence $S$, the continuity of the likelihood
function and the Tuffley-Steel result (Theorem 3.1) imply that there is a positive real number
$\epsilon(S)$ such that if $u > \frac{1}{r} - \epsilon(S)$, then ML and MP choose the same sets
of trees. Therefore, for a bounded number of taxa, since there are only finitely many
sequences of length at most $k$, we set $\epsilon := \min_S(\epsilon(S))$, where the
minimization takes place over all character sequences of length at most $k$, and conclude that
MP and ML would be equivalent (in the sense of the Tuffley-Steel result) for all
$u > \frac{1}{r} - \epsilon$, for all sequences of length at most $k$. Therefore, as $u$
approaches $\frac{1}{r}$, the sequence length $n$ of sequences for which MP and ML make
conflicting choices has to tend to infinity.

We now complement the above inequivalence results by showing that for sufficiently 
small choices of u, all ML-trees are also MP-trees. To prove this result, we first establish 
lower and upper bounds for the maximum probability of observing a character given a 
tree. 

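Proposition 3.4. Let $T$ be a phylogenetic X-tree, where $|X| = m$. Let $f$ be a character
on $X$. Let $0 < u \le \frac{1}{r}$. Then under the $N_r$-model with all substitution
probabilities bounded by $u$, we have

$$\frac{1}{r} u^{l_T(f)} \le \max_p P(f|T,p) \le r^{m-3} u^{l_T(f)}.$$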
Proof. For the lower bound, just as in the Tuffley-Steel approach explained above, we
take a most parsimonious extension $g$ of $f$ and assign substitution probability $u$ to each
edge that has a substitution in $g$, and 0 to all other edges. Considering the $r$ possible
root states (for an arbitrarily chosen root), this gives the lower bound for
$\max_p P(f|T,p)$.

To prove the upper bound, we observe that there are at most $r^{m-2}$ extensions of $f$,
where $m - 2$ is the maximum number of internal vertices (attained when $T$ is binary), each
of which may be assigned any of the $r$ states. We will now analyze these extensions. Let $g$
be any extension of $f$. For substitution probabilities $p_e \in \{0, u\}$ assigned to the
edges of the tree, the value of $P(g|T,p)$ for an assignment of probabilities that maximizes
$P(f|T,p)$ is either 0 (if one of the edges where there is a substitution in $g$ has been
assigned a substitution probability 0) or is given by

$$P(g|T,p) = \frac{1}{r} u^{k_1} (1 - (r-1)u)^{k_2} \le \frac{1}{r} u^{k_1} \le \frac{1}{r} u^{l_T(f)}, \qquad (1)$$

where $k_1 \ge l_T(f)$ is the number of edges where there is a substitution in $g$, and $k_2$
is the number of edges which require no substitution in $g$ but have been assigned
substitution probability $u$. The factor $\frac{1}{r}$ is caused by the $r$ different possible
choices for the root state.

The upper bound now follows by summing the probabilities of all extensions. □
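The two bounds of Proposition 3.4 can be checked numerically on the four-taxon examples used earlier. The sketch below computes $l_T(f)$ and the corner maximum of $P(f|T,p)$ by enumeration and returns them together with the lower bound $\frac{1}{r}u^{l_T(f)}$ and the upper bound $r^{m-3}u^{l_T(f)}$. All names and the value $u = 0.1$ are illustrative assumptions.

```python
from itertools import product

def extensions(leaf_states, internal, states):
    for assignment in product(states, repeat=len(internal)):
        g = dict(leaf_states)
        g.update(zip(internal, assignment))
        yield g

def parsimony_score(edges, leaf_states, internal, states):
    return min(sum(g[a] != g[b] for a, b in edges)
               for g in extensions(leaf_states, internal, states))

def char_prob(edges, probs, leaf_states, internal, states):
    r = len(states)
    total = 0.0
    for g in extensions(leaf_states, internal, states):
        prob = 1.0 / r
        for (a, b), p_e in zip(edges, probs):
            prob *= p_e if g[a] != g[b] else 1.0 - (r - 1) * p_e
        total += prob
    return total

def max_over_corners(edges, leaf_states, internal, states, u):
    return max(char_prob(edges, probs, leaf_states, internal, states)
               for probs in product((0.0, u), repeat=len(edges)))

STATES = "abc"
INTERNAL = ["v", "w"]
TREES = {
    "12|34": [(1, "v"), (2, "v"), (3, "w"), (4, "w"), ("v", "w")],
    "13|24": [(1, "v"), (3, "v"), (2, "w"), (4, "w"), ("v", "w")],
}
CHARACTERS = {
    "f1": dict(zip((1, 2, 3, 4), "aabb")),
    "f2": dict(zip((1, 2, 3, 4), "abcb")),
}
m, r, u = 4, 3, 0.1

def check_bounds():
    """Return (tree, character, lower bound, corner maximum, upper bound) tuples."""
    rows = []
    for tree_name, edges in TREES.items():
        for char_name, f in CHARACTERS.items():
            score = parsimony_score(edges, f, INTERNAL, STATES)
            best = max_over_corners(edges, f, INTERNAL, STATES, u)
            rows.append((tree_name, char_name, u ** score / r, best, r ** (m - 3) * u ** score))
    return rows
```

Each row returned by `check_bounds()` should have its corner maximum sandwiched between the two bounds.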

Now we will use the above bounds to derive the desired conclusion on ML-trees. 

Theorem 3.5. Let $T_a$ and $T_b$ be two phylogenetic X-trees, where $|X| = m$, and let
$S := f_1, f_2, \ldots, f_n$ be a sequence of characters on $X$. Let the substitution
probabilities on all edges of $T_a$ and $T_b$ be bounded by $u < r^{-(m-2)n}$. Then under the
$N_r$-model with no common mechanism, we have:

$$l_{T_b}(S) < l_{T_a}(S) \implies \max P(S|T_b) > \max P(S|T_a).$$

Proof. By Proposition 3.4, we have:

$$\max P(S|T_a) \le r^{(m-3)n} u^{\sum_{i=1}^{n} l_{T_a}(f_i)} \qquad (2)$$

and

$$\max P(S|T_b) \ge \left(\frac{1}{r}\right)^{n} u^{\sum_{i=1}^{n} l_{T_b}(f_i)}. \qquad (3)$$

Note that for any positive integers $a$ and $b$ such that $b < a$ and any positive constant
$c$, for sufficiently small values of $u$, we have $u^a < c u^b$. Now let
$b := \sum_{i=1}^{n} l_{T_b}(f_i)$, $a := \sum_{i=1}^{n} l_{T_a}(f_i)$ and
$c := \frac{r^{-n}}{r^{(m-3)n}} = r^{-(m-2)n}$. Since $a - b \ge 1$, the bound $u < c$ gives
$u^{a-b} \le u < c$ and hence $u^a < c u^b$. Then Equations (2) and (3) imply that
$\max P(S|T_b) > \max P(S|T_a)$. □
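As an illustration of Theorem 3.5, the sketch below takes the four-taxon, three-state sequence $S = f_1 f_2 f_2 f_2$ (so $n = 4$ characters in total), chooses $u$ at half the threshold $r^{-(m-2)n}$, and compares the maximum likelihoods of the trees $12|34$ and $13|24$ computed at the corners $\{0,u\}$. The sequence, the value of $u$, and all names are assumptions made for this example only.

```python
from itertools import product

def extensions(leaf_states, internal, states):
    for assignment in product(states, repeat=len(internal)):
        g = dict(leaf_states)
        g.update(zip(internal, assignment))
        yield g

def parsimony_score(edges, leaf_states, internal, states):
    return min(sum(g[a] != g[b] for a, b in edges)
               for g in extensions(leaf_states, internal, states))

def char_prob(edges, probs, leaf_states, internal, states):
    r = len(states)
    total = 0.0
    for g in extensions(leaf_states, internal, states):
        prob = 1.0 / r
        for (a, b), p_e in zip(edges, probs):
            prob *= p_e if g[a] != g[b] else 1.0 - (r - 1) * p_e
        total += prob
    return total

def max_over_corners(edges, leaf_states, internal, states, u):
    return max(char_prob(edges, probs, leaf_states, internal, states)
               for probs in product((0.0, u), repeat=len(edges)))

def max_sequence_likelihood(edges, sequence, internal, states, u):
    """No common mechanism: the maximum factorizes over the characters."""
    value = 1.0
    for f in sequence:
        value *= max_over_corners(edges, f, internal, states, u)
    return value

STATES = "abc"
INTERNAL = ["v", "w"]
T_b = [(1, "v"), (2, "v"), (3, "w"), (4, "w"), ("v", "w")]   # 12|34, lower parsimony score
T_a = [(1, "v"), (3, "v"), (2, "w"), (4, "w"), ("v", "w")]   # 13|24
f1 = dict(zip((1, 2, 3, 4), "aabb"))
f2 = dict(zip((1, 2, 3, 4), "abcb"))
S = [f1, f2, f2, f2]

m, r, n = 4, len(STATES), len(S)
threshold = float(r) ** (-(m - 2) * n)
u = threshold / 2.0

score_b = sum(parsimony_score(T_b, f, INTERNAL, STATES) for f in S)
score_a = sum(parsimony_score(T_a, f, INTERNAL, STATES) for f in S)
like_b = max_sequence_likelihood(T_b, S, INTERNAL, STATES, u)
like_a = max_sequence_likelihood(T_a, S, INTERNAL, STATES, u)
```

Below the threshold the more parsimonious tree is also the more likely one, which contrasts with the behavior of the same two trees for larger $u$ in the proof of Proposition 3.2.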

The following corollary directly follows from the above theorem. 

Corollary 3.6. Let $S$ be a sequence of $n$ characters on a set of $m$ taxa. Then there is an
$\epsilon = \epsilon(m,n,r)$ such that under the $N_r$-model with no common mechanism and with
all substitution probabilities subject to an upper bound $u \in [0,\epsilon)$, all ML-trees of
$S$ are also MP-trees.



3.2 Molecular clock 

We now prove a statement similar to Proposition 3.2 but with substitution probabilities
which conform to a molecular clock. Moreover, we consider only the three-state symmetric
model. Under the $N_3$-model, we consider placing a bound $P_{max}$ on the probability of each
particular substitution from the root to any leaf. The value $P_{max} = \frac{1}{3}$ means we
place no bound beyond that already in the $N_3$-model, while $P_{max} < \frac{1}{3}$ limits the
tree depth.

Proposition 3.7. Under the $N_3$-model with no common mechanism, with the substitution
probabilities constrained by a molecular clock, MP and ML are not equivalent for any bound
$P_{max} \in (0, \frac{1}{3}]$.

Proof. Consider the two rooted four-taxa trees $T_1$ and $T_2$ along with substitution
probabilities $p_i$ and $\tilde{p}_i$, respectively, on their edges as shown in Figure 4. The
trees have the same shape but different leaf labels, and possibly different probabilities of a
substitution from the root to any of its leaves. Under a molecular clock, we have $p_1 = p_2$
and $p_3 = p_4$ in $T_1$, and $\tilde{p}_1 = \tilde{p}_3$ and $\tilde{p}_2 = \tilde{p}_4$ in
$T_2$. Let $p, \tilde{p} \in [0, P_{max}]$ be the probabilities of a substitution from the
root $\rho$ to any leaf in $T_1$ and $T_2$, respectively.

Then under the $N_3$-model, we write $p$ and $\tilde{p}$ in terms of the substitution
probabilities on the edges of the trees as follows:

$$p = (1 - 2p_5)p_1 + p_5(1 - 2p_1) + p_5 p_1 = p_1 + p_5 - 3p_1p_5$$
$$\phantom{p} = (1 - 2p_6)p_3 + p_6(1 - 2p_3) + p_6 p_3 = p_3 + p_6 - 3p_3p_6.$$

Thus $p_5 = \frac{p - p_1}{1 - 3p_1}$ and $p_6 = \frac{p - p_3}{1 - 3p_3}$. Similarly, on
$T_2$, we have $\tilde{p}_5 = \frac{\tilde{p} - \tilde{p}_1}{1 - 3\tilde{p}_1}$ and
$\tilde{p}_6 = \frac{\tilde{p} - \tilde{p}_2}{1 - 3\tilde{p}_2}$.



Figure 4: Rooted binary trees $T_1$ and $T_2$, which conform to a molecular clock, and the assignment of characters
$f_1 = \alpha\alpha\beta\beta$ and $f_2 = \alpha\beta\gamma\beta$.

As in the proof of Proposition 3.2, we consider the $N_3$-model with state space
$C := \{\alpha, \beta, \gamma\}$. Consider the characters $f_1 := \alpha\alpha\beta\beta$ and
$f_2 := \alpha\beta\gamma\beta$, and a sequence of characters $S := f_1 f_2 \cdots f_2$, where
$f_2$ is repeated $n$ times and $n$ is a positive integer.

As before, $T_1$ is the unique MP-tree of $S$. We claim that $T_1$ is not an ML-tree if $n$ is
sufficiently large. In order to show this, we show that $\max P(S|T_2) > \max P(S|T_1)$ for a
suitable choice of $n$.



We have

$$\frac{\max P(S|T_2)}{\max P(S|T_1)} = \frac{(\max P(f_1|T_2))(\max P(f_2|T_2))^n}{(\max P(f_1|T_1))(\max P(f_2|T_1))^n}.$$

We now demonstrate that $\max P(f_2|T_2) > \max P(f_2|T_1)$ for all values of $P_{max}$. This
allows us to choose a sufficiently large value of $n$ so that the ratio above is more than 1.

First we seek to maximize:

$$P(f_2|T_1,p) = \sum_{c \in C} P(f_2|T_1,p,\rho = c)P(\rho = c) = \frac{1}{3}\sum_{c \in C} P(f_2|T_1,p,\rho = c).$$

Using a computer algebra system, we expand the right-hand side of this equation by
summing the probabilities over all possible assignments of states to the internal nodes 5
and 6, and substitute $p_5 = \frac{p - p_1}{1 - 3p_1}$ and $p_6 = \frac{p - p_3}{1 - 3p_3}$ to
obtain:

$$P(f_2|T_1,p) = \frac{p_1 p_3 (3p_1p_3 - 2p_1 - 2p_3 + 1 + 2p - 3p^2)}{3}.$$

Observe that for any fixed values of $p_1 + p_3$ and $p$, the expression above is maximized
when $p_1 p_3$ is maximized, i.e. when $p_1 = p_3$. Therefore, we can substitute $p_1$ for
$p_3$ and maximize the resulting expression given by:

$$\frac{p_1^2 (1 - p - p_1)(1 + 3p - 3p_1)}{3}.$$

Under the constraint $p \in [0, P_{max}]$, straightforward arguments show that the expression
shown above has a maximum at $p_1 = p = P_{max}$. Therefore:

$$\max P(f_2|T_1) = \frac{P_{max}^2 (1 - 2P_{max})}{3}. \qquad (4)$$

Similar calculations show that:

$$\max P(f_2|T_2) = P_{max}^2 (1 - 2P_{max}), \qquad (5)$$

where the maximum is obtained by setting $\tilde{p}_2 = \tilde{p}_4 = \tilde{p}_5 = 0$ and
$\tilde{p}_1 = \tilde{p}_3 = \tilde{p}_6 = \tilde{p} = P_{max}$.

Equations (4) and (5) imply that $\max P(f_2|T_2) > \max P(f_2|T_1)$ for all
$P_{max} \in (0, \frac{1}{3}]$. Now we can select a sufficiently large value of $n$ so that
$\max P(S|T_1) < \max P(S|T_2)$, where the actual choice of $n$ will depend on the ratio
$\frac{\max P(f_1|T_1)}{\max P(f_1|T_2)}$ and on $P_{max}$.

This analysis does not show that $T_2$ is an ML-tree, but it shows that $T_1$, which is the
unique MP-tree, is not an ML-tree. Therefore, the two methods are not equivalent under
the constraint of a molecular clock, even when we assume no common mechanism. □
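The closed-form expressions in this proof can be cross-checked by direct enumeration over the states of the root and of the internal nodes 5 and 6. The sketch below assumes the tree shape described in the proof (node 5 above one pair of leaves, node 6 above the other pair, both attached to the root); the function names and the sample parameter values are illustrative assumptions.

```python
from itertools import product

STATES = "abc"

def trans(p, a, b):
    """N_3 transition probability along an edge with substitution probability p."""
    return p if a != b else 1.0 - 2.0 * p

def clock_tree_prob(f, pair5, pair6, leaf_p, p5, p6):
    """P(f | rooted four-taxon tree), summing over the states of the root and nodes 5, 6."""
    total = 0.0
    for root, s5, s6 in product(STATES, repeat=3):
        prob = (1.0 / 3.0) * trans(p5, root, s5) * trans(p6, root, s6)
        for leaf in pair5:
            prob *= trans(leaf_p[leaf], s5, f[leaf])
        for leaf in pair6:
            prob *= trans(leaf_p[leaf], s6, f[leaf])
        total += prob
    return total

def internal_prob(p, p_leaf):
    """Internal-edge probability implied by the clock: p = p_leaf + p_int - 3 p_leaf p_int."""
    return (p - p_leaf) / (1.0 - 3.0 * p_leaf)

def closed_form_T1(p, p1, p3):
    return p1 * p3 * (3 * p1 * p3 - 2 * p1 - 2 * p3 + 1 + 2 * p - 3 * p * p) / 3.0

f2 = {1: "a", 2: "b", 3: "c", 4: "b"}

# Generic point on T1 (leaves 1, 2 below node 5; leaves 3, 4 below node 6).
p, p1, p3 = 0.2, 0.1, 0.05
enumerated_T1 = clock_tree_prob(f2, (1, 2), (3, 4), {1: p1, 2: p1, 3: p3, 4: p3},
                                internal_prob(p, p1), internal_prob(p, p3))

# Maximizing assignments stated in the proof, for an illustrative bound P_max = 0.2.
P_max = 0.2
best_T1 = clock_tree_prob(f2, (1, 2), (3, 4), {1: P_max, 2: P_max, 3: P_max, 4: P_max}, 0.0, 0.0)
best_T2 = clock_tree_prob(f2, (1, 3), (2, 4), {1: P_max, 3: P_max, 2: 0.0, 4: 0.0}, 0.0, P_max)
```

At these points the enumerated values should agree with the closed form for $P(f_2|T_1,p)$ and with Equations (4) and (5), and the value on $T_2$ is three times the value on $T_1$.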



4 Discussion and Outlook 

Our main objective was to present examples of sequences of characters for which MP and
ML with no common mechanism may choose different sets of trees under the $N_r$-model
when the substitution probabilities are bounded above by $u < \frac{1}{r}$ or when a molecular
clock is assumed. Our four-taxa examples with $r \ge 3$ character states show that even if
the upper bound $u$ is arbitrarily close to $\frac{1}{r}$, we can find sequences of characters
which are sufficiently long to cause MP and ML to make conflicting choices.

The motivation for our four-taxa examples came from the idea of the so-called 'misleading
sequences', which are sequences for which the (parsimoniously) perfect phylogeny
(i.e. a tree on which the whole sequence is completely homoplasy-free) and the tree on
which the derived Hamming distances are additive differ (for details, see
[Bandelt and Fischer 2008], [Huson and Steel 2004]). Even though this discrepancy refers to
perfect phylogenies (as opposed to general MP-trees), we used a similar idea to construct our
four-taxa examples. In particular, the idea underlying the construction of our sequences is
based on the fact that MP ignores parsimoniously non-informative characters in any sequence,
whereas ML (just as distance-based methods) does not. We exploited this fact to cause
a discrepancy between MP and ML by taking sufficiently many non-informative characters.

It has been known that there are no binary 'misleading sequences': if a set of binary
characters is convex on a binary phylogenetic tree, then the Hamming distances of this
sequence are a tree metric on the same tree (see [Semple and Steel 2003], Prop. 7.1.9).
But for MP and ML, it is still unknown if there is a sequence of binary characters for which
these methods make conflicting choices when the substitution probabilities are bounded
above by $u < \frac{1}{2}$. We looked at binary characters on five-taxa trees and found
character sequences for which some MP-trees are not ML-trees, but we observed that the
ML-trees in our examples were also MP-trees. This means that the equivalence of MP and ML
failed, but we did not find an example of strictly conflicting choices in the binary case.
Also, we did not find examples of sequences for which MP and ML are not equivalent for
values of $u$ which are arbitrarily close to $\frac{1}{2}$. Thus, it would be interesting to
analyze two-state models further to decide if all ML-trees are MP-trees and if the
equivalence between MP and ML under no common mechanism can fail for values of $u$ that are
arbitrarily close to $\frac{1}{2}$.

MP is traditionally assumed to be justified (in the sense of agreement with ML) whenever
substitution probabilities are small (see, for example, [Felsenstein 2004]). Therefore,
our result, which shows that an upper bound on the substitution probabilities can make
the equivalence of MP and ML fail under the $N_r$-model with no common mechanism,
is particularly surprising. On the other hand, we have shown that for sufficiently small
choices of the upper bound, all ML-trees are at least also MP-trees (but not vice versa).
So in summary, although MP has been proven to agree with ML in the $N_r$-model under
the assumption of no common mechanism (and under no further constraints), our examples
show that this equivalence may fail when the model is changed slightly. Therefore,
we conclude that neither the presence nor the absence of a common mechanism alone can
justify MP in the sense of an MP-ML equivalence. More research could be done on other
models of nucleotide substitution in order to analyze conditions under which ML and MP
may give conflicting results. This might highlight even more differences between MP and
ML.